Mining High-Dimensional Data
Abstract
With the rapid growth of computational biology and e-commerce applications, high-dimensional data has become very common. Thus, mining high-dimensional data is an urgent problem of great practical importance. However, mining high-dimensional data poses some unique challenges, including (1) the curse of dimensionality and, more crucially, (2) the meaningfulness of the similarity measure in high-dimensional space. In this chapter, we present several state-of-the-art techniques for analyzing high-dimensional data, e.g., frequent pattern mining, clustering, and classification. We will discuss how these methods deal with the challenges of high
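The second challenge named in the abstract, the meaningfulness of similarity measures, can be illustrated with a small experiment. The sketch below is not taken from the chapter: it assumes points drawn uniformly from the unit cube and Euclidean distance, and the function name `relative_contrast` and its parameters are hypothetical. It measures how much farther the farthest point is from a query than the nearest one; as the dimension grows, this gap tends to shrink relative to the nearest distance, so "nearest" carries less information.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (farthest - nearest) / nearest for one random query point.

    Assumption (illustrative only): data and query are sampled uniformly
    from the unit cube [0, 1]^dim and compared with Euclidean distance.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


if __name__ == "__main__":
    # The printed contrast should fall sharply as the dimension increases.
    for dim in (2, 10, 100, 1000):
        print(f"dim={dim:5d}  relative contrast={relative_contrast(dim):.3f}")
```

Under these assumptions, a contrast close to zero means that all points are almost equally far from the query, which is one reason distance-based clustering and nearest-neighbor classification need extra care in high dimensions.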
Similar resources
High-Dimensional Unsupervised Active Learning Method
In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...
Calculation of One-dimensional Forward Modelling of Helicopter-borne Electromagnetic Data and a Sensitivity Matrix Using Fast Hankel Transforms
The helicopter-borne electromagnetic (HEM) frequency-domain exploration method is an airborne electromagnetic (AEM) technique that is widely used for vast and rough areas for resistivity imaging. The vast amount of digitized data flowing from the HEM method requires an efficient and accurate inversion algorithm. Generally, the inverse modelling of HEM data in the first step requires a precise a...
Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach
Feature selection can be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not perform well in these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...
Performance evaluation of chain saw machines for dimensional stones using feasibility of neural network models
Prediction of the production rate of the cutting dimensional stone process is crucial, especially when chain saw machines are used. The cutting dimensional rock process is generally a complex issue with numerous effective factors including variable and unreliable conditions of the rocks and cutting machines. The Group Method of Data Handling (GMDH) type of neural network and Radial Basis Functi...
TCMiner: A High Performance Data Mining System for Multi-dimensional Data Analysis of Traditional Chinese Medicine Prescriptions
This paper introduces the architecture and algorithms of TCMiner: a high performance data mining system for multi-dimensional data analysis of Traditional Chinese Medicine prescriptions. The system has the following competing advantages: (1) High Performance (2) Multi-dimensional Data Analysis Capability (3) High Flexibility (4) Powerful Interoperability (5) Special Optimization for TCM. This d...
Outlier detection for high dimensional data pdf
Is particularly useful for high dimensional data where outliers cannot be found. High dimensional data in Euclidean space pose special challenges to data. In about just the last few years, the task of unsupervised outlier detection has found. Outlier detection is an outstanding data mining task referred to...